KorAP Architecture ― Diving in the Deep Sea of Corpus Data

Authors

  • Nils Diewald
  • Michael Hanl
  • Eliza Margaretha
  • Joachim Bingel
  • Marc Kupietz
  • Piotr Banski
  • Andreas Witt
Abstract

KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS). It supports very large corpora with multiple annotation layers, multiple query languages, and complex licensing scenarios. KorAP’s design aims to be scalable, flexible, and sustainable to serve the German Reference Corpus DEREKO for at least the next decade. To meet these requirements, we have adopted a highly modular microservice-based architecture. This paper outlines our approach: An architecture consisting of small components that are easy to extend, replace, and maintain. The components include a search backend, a user and corpus license management system, and a web-based user frontend. We also describe a general corpus query protocol used by all microservices for internal communications. KorAP is open source, licensed under BSD-2, and available on GitHub.

Similar resources

Access control by query rewriting: the case of KorAP

We present an approach to an aspect of managing complex access scenarios to large and heterogeneous corpora that involves handling user queries that, intentionally or due to the complexity of the queried resource, target texts or annotations outside of the given user’s permissions. We first outline the overall architecture of the corpus analysis platform KorAP, devoting some attention to the wa...

Using information retrieval technology for a corpus analysis platform

This paper describes a practical approach to use the information retrieval engine Lucene for the corpus analysis platform KorAP, currently being developed at the Institut für Deutsche Sprache (IDS Mannheim). It presents a method to use Lucene’s indexing technique and to exploit it for linguistically annotated data, allowing full flexibility to handle multiple annotation layers. It uses multiple...

Foraging energetics and diving behavior of lactating New Zealand sea lions, Phocarctos hookeri.

The New Zealand sea lion, Phocarctos hookeri, is the deepest- and longest-diving sea lion. We were interested in whether the diving ability of this animal was related to changes in its at-sea and diving metabolic rates. We measured the metabolic rate, water turnover and diving behavior of 12 lactating New Zealand sea lions at Sandy Bay, Enderby Island, Auckland Islands Group, New Zealand (50 de...

Activity and diving metabolism correlate in Steller sea lion Eumetopias jubatus

Three Steller sea lions Eumetopias jubatus were trained to participate in free-swimming, open-ocean experiments designed to determine if activity can be used to estimate the energetic cost of finding prey at depth. Sea lions were trained to dive to fixed depths of 10 to 50 m, and to re-surface inside a floating dome to measure energy expenditure via gas exchange. A 3-axis accelerometer was atta...

Deep-diving sea lions exhibit extreme bradycardia in long-duration dives.

Heart rate and peripheral blood flow distribution are the primary determinants of the rate and pattern of oxygen store utilisation and ultimately breath-hold duration in marine endotherms. Despite this, little is known about how otariids (sea lions and fur seals) regulate heart rate (fH) while diving. We investigated dive fH in five adult female California sea lions (Zalophus californianus) dur...

Publication date: 2016